Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions

Authors

  • Zheng Qin
  • Zhaoning Zhang
  • Dongsheng Li
  • Yiming Zhang
  • Yuxing Peng
Abstract

Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity. To address this problem, in this paper we present an efficient method (called diagonalwise refactorization) for accelerating the training of depthwise convolution layers. Our key idea is to rearrange the weight vectors of a depthwise convolution into a large diagonal weight matrix so as to convert the depthwise convolution into one single standard convolution, which is well supported by the cuDNN library that is highly-optimized for GPU computations. We have implemented our training method in five popular deep learning frameworks. Evaluation results show that our proposed method gains 15.4× training speedup on Darknet, 8.4× on Caffe, 5.4× on PyTorch, 3.5× on MXNet, and 1.4× on TensorFlow, compared to their original implementations of depthwise convolutions.
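The refactorization described in the abstract is easy to sketch. Below is a minimal PyTorch illustration (my own sketch, not the authors' released code; the function and variable names are mine): the M per-channel filters of a depthwise layer are scattered onto the diagonal of an (M, M, K, K) weight tensor, and a single standard convolution over that tensor reproduces the depthwise output while running through cuDNN's standard-convolution kernels.

```python
import torch
import torch.nn.functional as F

def diagonalwise_conv2d(x, dw_weight, stride=1, padding=1):
    """Emulate a depthwise convolution with one standard convolution.

    x:         input of shape (N, M, H, W)
    dw_weight: depthwise weights of shape (M, 1, K, K)

    The M per-channel filters are placed on the diagonal of an
    (M, M, K, K) weight tensor; off-diagonal filters stay zero, so a
    single standard convolution computes exactly the depthwise result.
    """
    M, _, K, _ = dw_weight.shape
    diag_weight = dw_weight.new_zeros(M, M, K, K)
    idx = torch.arange(M)
    diag_weight[idx, idx] = dw_weight[:, 0]   # filters onto the diagonal
    return F.conv2d(x, diag_weight, stride=stride, padding=padding)

# Quick check against the native grouped implementation (illustrative shapes).
x = torch.randn(2, 32, 56, 56)
w = torch.randn(32, 1, 3, 3)
out_diag = diagonalwise_conv2d(x, w)
out_dw = F.conv2d(x, w, stride=1, padding=1, groups=32)
assert torch.allclose(out_diag, out_dw, atol=1e-4)
```

Note that the diagonal tensor holds M times more weights than the depthwise layer (most of them zero), so the refactorization trades extra memory and mult-adds for the much higher GPU utilization of the standard-convolution path.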


Similar Resources

Merging and Evolution: Improving Convolutional Neural Networks for Mobile Applications

Compact neural networks are inclined to exploit “sparsely-connected” convolutions such as depthwise convolution and group convolution for employment in mobile applications. Compared with standard “fully-connected” convolutions, these convolutions are more computationally economical. However, “sparsely-connected” convolutions block the inter-group information exchange, which induces severe perfo...
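To make the "computationally economical" claim concrete, a small back-of-the-envelope sketch (illustrative numbers of my own, not taken from the paper) compares the weight counts of standard, group, and depthwise convolutions:

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given group count
    (bias omitted). groups=1 is a standard convolution; groups=c_in
    with c_out=c_in is a depthwise convolution."""
    return (c_in // groups) * c_out * k * k

c = 256
print(conv_params(c, c, 3))            # standard:  589,824
print(conv_params(c, c, 3, groups=8))  # group:      73,728
print(conv_params(c, c, 3, groups=c))  # depthwise:   2,304
```

The savings grow with the group count, but each output group sees only its own input group, which is exactly the blocked inter-group information exchange the abstract refers to.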


QuickNet: Maximizing Efficiency and Efficacy in Deep Architectures

We present QuickNet, a fast and accurate network architecture that is both faster and significantly more accurate than other “fast” deep architectures like SqueezeNet. Furthermore, it uses fewer parameters than previous networks, making it more memory efficient. We do this by making two major modifications to the reference “Darknet” model (Redmon et al., 2015): 1) The use of depthwise separable c...


MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build light weight deep neural networks. We introduce two simple global hyperparameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to cho...
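As a rough sketch of the building block behind MobileNets, the PyTorch snippet below shows a depthwise separable convolution (a 3x3 depthwise convolution followed by a 1x1 pointwise one) together with a width-multiplier hyper-parameter alpha that uniformly thins every layer; the second global hyper-parameter, the resolution multiplier, simply rescales the input. This is a common rendering of the idea, not the paper's code:

```python
import torch
import torch.nn as nn

def separable_block(c_in, c_out, stride=1, alpha=1.0):
    """MobileNet-style depthwise separable convolution with a width
    multiplier alpha that thins the channel counts uniformly."""
    c_in, c_out = int(c_in * alpha), int(c_out * alpha)
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, 3, stride=stride, padding=1,
                  groups=c_in, bias=False),   # depthwise 3x3
        nn.BatchNorm2d(c_in),
        nn.ReLU(inplace=True),
        nn.Conv2d(c_in, c_out, 1, bias=False),  # pointwise 1x1
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

# alpha=0.5 halves the channel counts of a 32 -> 64 block.
x = torch.randn(1, 16, 112, 112)
print(separable_block(32, 64, alpha=0.5)(x).shape)  # torch.Size([1, 32, 112, 112])
```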


Polynomials: a new tool for length reduction in binary discrete convolutions

Efficient handling of sparse data is a key challenge in computer science. Binary convolutions, such as polynomial multiplication or the Walsh Transform, are a useful tool in many applications and are efficiently solved. In the last decade, several problems have required efficient solution of sparse binary convolutions. Both randomized and deterministic algorithms were developed for efficiently comput...
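The polynomial connection the abstract alludes to is simple to state: a binary vector is the coefficient list of a polynomial, and the discrete convolution of two vectors is exactly the coefficient vector of the product polynomial. A one-line illustration (mine, not the paper's):

```python
import numpy as np

a = np.array([1, 0, 1, 1])  # A(x) = 1 + x^2 + x^3
b = np.array([1, 1, 0, 1])  # B(x) = 1 + x + x^3
print(np.convolve(a, b))    # [1 1 1 3 1 1 1], coefficients of A(x) * B(x)
```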


An Efficient Algorithm for Workspace Generation of Delta Robot

Dimensional synthesis of a parallel robot may be the initial stage of its design process, which is usually carried out based on a required workspace. Since optimization of the link lengths of the robot for the workspace is usually done, the workspace computation process must be run numerous times. Hence, the importance of the efficiency of the algorithm and the CPU time of the workspace computatio...



Publication date: 2018